{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\newcommand{\\d}{\\mathrm d}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Convexity of Generalized Linear Models\n",
    "\n",
    "Probability density of an exponential family distribution:\n",
    "$$ p(y;\\eta) = b(y) \\exp(\\eta^TT(y) -a(\\eta)).$$\n",
    "\n",
    "We assume $\\eta \\in \\mathbb R$ is a scalar and $T(y)=y$, so that \n",
    "$$ p(y;\\eta) = b(y) \\exp(\\eta y -a(\\eta)).$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (a)\n",
    "\n",
    "We have \n",
    "$$\\begin{align*} \n",
    "\\frac \\partial {\\partial \\eta} p(y;\\eta) &= b(y) \\exp(\\eta y-a(\\eta)) \\left(y - \\frac \\partial {\\partial \\eta} a(\\eta)\\right)\\\\\n",
    "&= y p(y;\\eta) - p(y;\\eta)\\frac \\partial {\\partial \\eta} a(\\eta)\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Therefore \n",
    "$$ yp(y;\\eta) = \\frac \\partial {\\partial \\eta} p(y;\\eta) + p(y;\\eta)\\frac \\partial {\\partial \\eta} a(\\eta)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Integrating this identity over the range of $y$, we find\n",
    "$$\\begin{align*} \n",
    "\\mathbb E[Y;\\eta] &=\\int yp(y,\\eta) \\d y\\\\\n",
    "&= \\int \\frac \\partial {\\partial \\eta} p(y;\\eta)\\d y + \\frac \\partial {\\partial \\eta} a(\\eta)\\int p(y;\\eta)\\d y\\\\\n",
    "&= \\frac \\partial {\\partial \\eta} \\int p(y;\\eta) \\d y + \\frac \\partial {\\partial \\eta} a(\\eta) \\cdot 1\\\\\n",
    "&= \\frac \\partial {\\partial \\eta} 1 + \\frac \\partial {\\partial \\eta} a(\\eta)\\\\\n",
    "&= \\frac \\partial {\\partial \\eta} a(\\eta)\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In line 3 we were allowed to interchange differentiation and integral, because if $\\eta$ is contained in the interior of some compact intervall $[-A,A]$, then $0\\leq p(y;\\eta) \\leq c(y)$ for some function $c$ that only depends on $y$ and has finite integral $\\int c(y) \\d y <\\infty$. \n",
    "\n",
    "To see this, note that the factor $\\exp(-a(\\eta))$ is bounded by $\\exp(-M)$, where $M:= \\min \\{a(\\eta) \\mid \\eta \\in [-A,A]\\}$ and that $\\exp(\\eta y) \\leq \\exp(A y) $ for $y\\geq 0$ and $\\exp(\\eta y) \\leq \\exp (-Ay)$ for $y\\leq 0$.\n",
    "\n",
    "Finally note that  \n",
    "$$\\begin{align*}\n",
    "\\int_{y\\leq 0}b(y)\\exp(-A y)\\d y + \\int_{y\\geq 0}b(y)\\exp(A y)\\d y &\\leq\n",
    "\\int b(y)\\exp(-A y)\\d y + \\int b(y)\\exp(A y)\\d y\\\\\n",
    "&\\leq \\exp(a(A))\\int p(y; -A)\\d y +\\exp(a(-A)) \\int p(y; A) \\d y \\\\\n",
    "&= \\exp(a(A)) + \\exp(a(-A))\n",
    "\\end{align*}$$\n",
    "gives an upper bound that does not depend on $\\eta$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (b)\n",
    "\n",
    "Using the derivative calculated in (a) we have \n",
    "$$\\begin{align*} \n",
    "\\frac {\\partial^2} {(\\partial \\eta)^2} p(y;\\eta) &= \\frac \\partial {\\partial \\eta} \\left(y p(y;\\eta) - p(y;\\eta)\\frac \\partial {\\partial \\eta} a(\\eta) \\right)\\\\\n",
    "&= y\\frac \\partial {\\partial \\eta}p(y;\\eta) - p(y;\\eta)\\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta) \\\\\n",
    "&= y \\left(y p(y;\\eta) - p(y;\\eta)\\frac \\partial {\\partial \\eta} a(\\eta)\\right)- p(y;\\eta)\\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta) \\\\\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Therefore\n",
    "$$ y^2p(y;\\eta) = yp(y;\\eta)\\frac \\partial {\\partial \\eta} a(\\eta) + p(y;\\eta)\\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta)+\\frac {\\partial^2} {(\\partial \\eta)^2} p(y;\\eta)  $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Integrating gives\n",
    "$$\\begin{align*} \n",
    "\\mathbb E[Y^2;\\eta] &=\\int y^2p(y,\\eta) \\d y\\\\\n",
    "&= \\int yp(y;\\eta)\\d y\\frac \\partial {\\partial \\eta} a(\\eta) + \\int p(y;\\eta)\\d y\\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta)+\\frac {\\partial^2} {(\\partial \\eta)^2} \\int p(y;\\eta)\\d y\\\\\n",
    "&= \\mathbb E[Y;\\eta]\\frac \\partial {\\partial \\eta} a(\\eta) + 1\\cdot\\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta) +0\\\\\n",
    "&= \\mathbb E[Y;\\eta]^2 +\\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta) \n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So\n",
    "$$\\begin{align*}\n",
    "\\text{Var}[Y;\\eta] &=   \\mathbb E[Y^2;\\eta] - \\mathbb E[Y;\\eta]^2\\\\\n",
    "&= \\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta). \n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (c)\n",
    "\n",
    "With $\\eta = x^T\\theta$, we can write $\\ell(\\theta)$ as\n",
    "$$\\begin{align*}\n",
    "\\ell(\\theta) &= - \\log p(y;\\eta) \\\\\n",
    "&= -\\log b(y) - y x^T\\theta + a(x^T\\theta)\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Therefore\n",
    "$$ \\d_\\theta \\ell = -yx^T + \\frac \\partial {\\partial \\eta} a(\\eta) x^T$$\n",
    "and can write the gradient as \n",
    "$$ \\nabla_\\theta \\ell(\\theta)=-x\\left(y - \\frac \\partial {\\partial \\eta} a(\\eta)\\right).$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Remark**: Note that by part (a) this means that the gradient always has the form $-x\\left(y- h_\\theta(x^T\\theta)\\right)$. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hessian is then given by \n",
    "$$\\begin{align*}\n",
    "\\nabla^2_\\theta \\ell(\\theta) &= x \\frac {\\partial^2} {(\\partial \\eta)^2} a(\\eta)x^T\\\\\n",
    "&=\\text{Var}[Y;\\eta] xx^T,\n",
    "\\end{align*}$$\n",
    "which clearly is PSD."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
